When Are Concepts Erased From Diffusion Models?

Lu, Kevin, Kriplani, Nicky, Gandikota, Rohit, Pham, Minh, Bau, David, Hegde, Chinmay, Cohen, Niv

arXiv.org Artificial Intelligence

In concept erasure, a model is modified to selectively prevent it from generating a target concept. Despite the rapid development of new methods, it remains unclear how thoroughly these approaches remove the target concept from the model. We begin by proposing two conceptual models for the erasure mechanism in diffusion models: (i) interfering with the model's internal guidance processes, and (ii) reducing the unconditional likelihood of generating the target concept, potentially removing it entirely. To assess whether a concept has been truly erased from the model, we introduce a comprehensive suite of independent probing techniques: supplying visual context, modifying the diffusion trajectory, applying classifier guidance, and analyzing the model's alternative generations that emerge in place of the erased concept. Our results shed light on the value of exploring concept erasure robustness outside of adversarial text inputs, and emphasize the importance of comprehensive evaluations for erasure in diffusion models. Our code, data, and results are available at unerasing.baulab.info.
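To make the classifier-guidance probe concrete, here is a toy one-dimensional sketch: a "model" whose unconditional score no longer places mass on the target concept is sampled with and without an added classifier gradient that pulls toward the concept. All functions, constants, and names here are invented for illustration; this is not the paper's implementation, only a minimal picture of how guidance can surface a concept that standard sampling no longer produces.

```python
import numpy as np

def probe_with_classifier_guidance(score_fn, classifier_grad, x0,
                                   steps=200, lr=0.05, scale=1.0):
    """Langevin-style sampling: follow the model's score plus a
    classifier-guidance term pushing toward the target concept."""
    rng = np.random.default_rng(0)
    x = x0.astype(float)
    for _ in range(steps):
        grad = score_fn(x) + scale * classifier_grad(x)
        x = x + lr * grad + np.sqrt(2 * lr) * 0.01 * rng.standard_normal(x.shape)
    return x

# Toy "erased" model: its unconditional score centers samples at x = 0,
# i.e. it no longer generates the concept located near x = 3.
score_model = lambda x: -(x - 0.0)
# Toy classifier gradient attracting samples to the erased concept at x = 3.
concept_grad = lambda x: -(x - 3.0)

x_init = np.zeros(4)
no_guidance = probe_with_classifier_guidance(score_model, concept_grad, x_init, scale=0.0)
with_guidance = probe_with_classifier_guidance(score_model, concept_grad, x_init, scale=4.0)
# Without guidance, samples stay near 0; with guidance they shift toward the
# concept region. In a real model, whether guidance can recover the concept
# distinguishes suppressed guidance from genuine removal.
```

If the concept were truly deleted from the model's knowledge, no amount of external steering should reconstruct it; recovery under guidance suggests the erasure only interfered with the conditioning pathway.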


VideoEraser: Concept Erasure in Text-to-Video Diffusion Models

Xu, Naen, Zhang, Jinghuai, Li, Changjiang, Chen, Zhi, Zhou, Chunyi, Li, Qingming, Du, Tianyu, Ji, Shouling

arXiv.org Artificial Intelligence

The rapid growth of text-to-video (T2V) diffusion models has raised concerns about privacy, copyright, and safety due to their potential misuse in generating harmful or misleading content. These models are often trained on large-scale datasets that can include unauthorized personal identities, artistic creations, and harmful materials, which can lead to the uncontrolled production and distribution of such content. To address this, we propose VideoEraser, a training-free framework that prevents T2V diffusion models from generating videos with undesirable concepts, even when explicitly prompted with those concepts. Designed as a plug-and-play module, VideoEraser can seamlessly integrate with representative T2V diffusion models via a two-stage process: Selective Prompt Embedding Adjustment (SPEA) and Adversarial-Resilient Noise Guidance (ARNG). We conduct extensive evaluations across four tasks: object erasure, artistic style erasure, celebrity erasure, and explicit content erasure. Experimental results show that VideoEraser consistently outperforms prior methods in efficacy, integrity, fidelity, robustness, and generalizability. Notably, VideoEraser achieves state-of-the-art performance in suppressing undesirable content during T2V generation, reducing it by 46% on average across the four tasks compared to baselines.
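The abstract does not spell out how the prompt-embedding adjustment works, but one common form of this idea is to project the concept direction out of the prompt embedding before it conditions generation. The sketch below illustrates that projection; the function name, vectors, and the specific rule are illustrative assumptions, not VideoEraser's actual SPEA.

```python
import numpy as np

def adjust_prompt_embedding(prompt_emb, concept_emb, strength=1.0):
    """Remove the target-concept direction from a prompt embedding:
    e' = e - strength * (e . u) * u, where u is the unit vector along
    the concept embedding. Illustrative sketch only."""
    u = concept_emb / np.linalg.norm(concept_emb)
    return prompt_emb - strength * np.dot(prompt_emb, u) * u

concept = np.array([1.0, 0.0, 0.0])  # hypothetical "erased concept" direction
prompt = np.array([0.8, 0.5, 0.2])   # hypothetical prompt embedding
adjusted = adjust_prompt_embedding(prompt, concept)
# The component of the prompt along the concept direction is zeroed,
# while the remaining (unrelated) components pass through unchanged.
```

A selective variant would apply this adjustment only when the prompt's similarity to the concept exceeds a threshold, leaving unrelated prompts untouched.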


CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models

Xue, Yuyang, Moroshko, Edward, Chen, Feng, McDonagh, Steven, Tsaftaris, Sotirios A.

arXiv.org Artificial Intelligence

Text-to-Image diffusion models can produce undesirable content that necessitates concept erasure techniques. However, existing methods struggle with under-erasure, leaving residual traces of targeted concepts, or over-erasure, mistakenly eliminating unrelated but visually similar concepts. To address these limitations, we introduce CRCE, a novel concept erasure framework that leverages Large Language Models to identify both semantically related concepts that should be erased alongside the target and distinct concepts that should be preserved. By explicitly modeling coreferential and retained concepts semantically, CRCE enables more precise concept removal, without unintended erasure. Experiments demonstrate that CRCE outperforms existing methods on diverse erasure tasks.
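The core trade-off CRCE targets, erasing coreferential concepts while preserving distinct ones, can be pictured as a toy objective in embedding space: push an edited representation away from the target and its LLM-proposed coreferential concepts while anchoring it to concepts flagged for retention. The loss form, vectors, and names below are illustrative assumptions; CRCE's actual method edits diffusion-model weights, not raw embeddings.

```python
import numpy as np

def erasure_objective(edited, target, coref, retained, lam=1.0):
    """Toy loss: reward distance from the target concept and its
    coreferential concepts (erase set), penalize distance from
    concepts that should be preserved. Illustrative only."""
    erase_set = [target] + coref
    erase_term = -sum(np.linalg.norm(edited - c) for c in erase_set)
    retain_term = sum(np.linalg.norm(edited - c) for c in retained)
    return erase_term + lam * retain_term

target = np.array([1.0, 0.0])       # concept to erase
coref = [np.array([0.9, 0.1])]      # e.g. a synonym proposed by an LLM
retained = [np.array([0.0, 1.0])]   # unrelated concept to preserve

cand_a = np.array([0.0, 1.0])  # far from the erase set, close to retained
cand_b = np.array([1.0, 0.0])  # still sitting on the target concept
better = cand_a if (erasure_objective(cand_a, target, coref, retained)
                    < erasure_objective(cand_b, target, coref, retained)) else cand_b
```

Without the retained set, the erase term alone would happily push the representation away from *everything* near the target, which is exactly the over-erasure failure mode the paper describes.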


Can Artificial Intelligence Systems like DALL-E or Midjourney Perform Creative Tasks?

#artificialintelligence

We are witnessing a major shift in how images are generated. The rapid growth of machine learning and artificial intelligence raises questions about how creative processes evolve and develop through technology. Systems like DALL-E, DALL-E 2, and Midjourney are AI programs trained to generate images from text descriptions using a dataset of text-image pairs. Their diverse capabilities include creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, and applying transformations to existing images. DALL-E and similar systems can create plausible images for a wide variety of sentences that explore the compositional structure of language.


DALL·E: Creating Images from Text

#artificialintelligence

DALL·E[1] is a 12-billion parameter version of GPT-3 trained to generate images from text descriptions, using a dataset of text–image pairs. We've found that it has a diverse set of capabilities, including creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images. GPT-3 showed that language can be used to instruct a large neural network to perform a variety of text generation tasks. Image GPT showed that the same type of neural network can also be used to generate images with high fidelity. We extend these findings to show that manipulating visual concepts through language is now within reach.